Abstract

Increasingly large numbers of people communicate
today via electronic means such as email or news forums.
One of the basic properties of the current electronic
communication means is the identification of
the end-points. However, at times it is desirable or
even critical to hide the identity and/or whereabouts
of the end-points (e.g., human users) involved.

This paper discusses the goals and desired properties
of anonymous email in general and introduces the
design and salient features of the Babel anonymous
remailer. Babel allows email users to converse electronically
while remaining anonymous with respect to each
other and to other - even hostile - parties. A range
of attacks and corresponding countermeasures is
considered. An attempt is made to formalize and quantify
certain dimensions of anonymity and untraceable
communication.

Keywords: security, email, mix, anonymity, untraceability,
traffic analysis, remailer

1 Introduction 

Explosive growth and proliferation of the global 
Internet in the past decade allowed millions of peo- 
ple to communicate via electronic mail. In many re- 
spects, email is rapidly replacing traditional paper 
mail. Email is not only fast and convenient but also - 
at least for the time being - free of charge for a large 
segment of users. 

There are, however, some aspects of email that can
be improved upon. First, most of today's Internet
email is not very secure. Sender authentication,
non-repudiation, data integrity and privacy are some of the
basic ingredients of secure email. While basic email
security is addressed to some extent by recent offerings
such as PGP [35] and PEM [16], their acceptance is far
from universal. Another important feature missing in
current email is support for anonymity and untraceability
of users. In the Internet milieu, it is quite unrealistic
to expect any security features of the underlying
network; eavesdroppers can easily record email
messages and gather addressing information. Traditional
paper mail, in contrast, allows one to send an
envelope with a printed destination address and no
return address. This, coupled with other common-sense
precautions, can make the sender untraceable
and anonymous, police and sleuth fiction to the contrary
notwithstanding.

In this paper we discuss the goals and desired properties
of anonymous email and then describe the design
and features of Babel - an anonymous remailer
developed at IBM Zurich Research Laboratory. In
brief, our approach is based on a special entity called
a "mix". The concept of a mix was first introduced
by Chaum [2] in the early eighties. A mix can be
viewed as a logical component (e.g., application layer
software) that forwards email messages and - in the
process - obfuscates the relationship between incoming
and outgoing message traffic.

The paper is organized as follows. In the next section
we begin by motivating the need for anonymity,
briefly reviewing previous work and describing the
goals of anonymous/untraceable email. Then, in Section
3 we introduce the concept of a mix and consider
the threats it faces. Sections 4, 5 and 6 are devoted
to the technical discussion of the Babel anonymous
remailer. Section 7 presents an attempt to quantify
some measures of anonymity. Finally, Section 8
describes the salient implementation issues.

2 Motivation 

It is no surprise that untraceable communication 
is a highly-charged and, at times, even controversial, 
topic [1, 18, 22, 23]. Anonymous email is anathema
to some people. This reputation is due largely to
the possible abuses of anonymity for the purposes of 
spreading libelous accusations, hate-filled propaganda, 
pornography and other unpleasant content. 

At the same time, anonymous mail has its legitimate
and benign uses. We divide these into four main
categories: 1

1. Discussion of sensitive and personal issues

2. Information searches

3. Freedom of speech in intolerant environments

4. Polling/Surveying

Many people in need of counseling or therapy, such
as victims of sexual, alcohol or drug abuse, can receive
support and counseling electronically while remaining
anonymous. For example, a victim of abuse would
probably be reluctant to participate in on-line therapy
sessions if there was a chance that someone they
knew was "listening". Thus it is often critical for the
identity of the user to remain secret. This need is also
widely recognised by the medical profession.

1 The list is not meant to be exhaustive.

We often seek information anonymously in the 
course of our everyday life. For example, an employee 
of one company may inquire about a job opening at 
another (perhaps competing) company; the need for
anonymity is obvious. 2 Furthermore, people often
seek information from sources that, should the identity
of the seeker become known, would act in a manner
not agreeable to the seeker. For example, a consumer
might like to browse a number of electronic shops and
compare prices before making a purchase. If the
consumer's identity were revealed, the visited shops could
place his name/address on their mailing lists and start
bombarding his mailbox with unwanted "junk" email.
There are other everyday cases where anonymity is an 
integral part of a transaction. 

On a more somber note, there are still, alas, a num- 
ber of totalitarian regimes in the world; places where 
nonviolent (e.g., verbal) opposition or dissent can have 
serious consequences including imprisonment, torture 
and death. Furthermore, even in the free world, there 
are intolerant and fanatical groups that violently and 
virulently harass critics for mere opinions. Examples 
abound.

In the same vein, there are also many well-known 
situations in which an individual may feel compelled 
to report corruption, criminal behavior or other mis- 
deeds. In such cases, being anonymous means being 
safe from varying degrees of retribution. 

Another useful, albeit rather non-controversial, ap- 
plication of anonymous email is in the area of polling 
and surveying. There are a number of organizations 
specializing in opinion surveys on a wide variety of 
topics. Participants' anonymity is one of the basic 
features of this activity.

Admittedly, the fundamental motivation for hiding
one's identification is the fear of retribution (either
rightful or wrongful). It is not the goal of this paper
to partake in the currently on-going debate on privacy
and anonymity on the Internet. We only note that
anonymity is an optional (and mostly legal) part of
regular, paper mail. Obviously, it can be misused,
yet there are no great debates on banning anonymous
usage of paper mail. Drawing a boundary between use 
and abuse of technology is a complicated philosophical 
matter; it is not treated in this paper. 

2.1 Previous work 

The first and the most authoritative paper to date
dealing with anonymous communication was published
by D. Chaum in 1981 [2]. The Babel remailer
described in this report owes much to his ideas.
Chaum also invented the DC-network [4], which provides
unconditional untraceability commensurate with
high bandwidth overhead. Pfitzmann and Waidner
have also done a considerable amount of work on
anonymity and untraceable communication in LAN
and ISDN environments [24, 26, 25, 27].

The oldest and (currently) most widely-used anonymous
remailer is located in Finland. It is called Penet
and is operated by J. Helsingius. Penet performs the
following functions: it strips the header information of
the incoming message and, in the outgoing message,
replaces the address of the sender with an alias. The alias
allows the recipient of the message to reply to the real
sender without knowing his identity.

The demand on the Penet remailer is quite high: 
over 7,000 messages are sent daily. The alias database 
contains 200,000 entries [11]. Recently, Penet has become
the subject of some controversy. 3

The second brand of remailers are promoted by a 
group called cypherpunks [12]. There are about ?0 
publicly available cypherpunk remailers. These remailers
offer some of the basic functionality described in
this paper. Although they share the same code base,
each differs in minor ways; some allow posting to newsgroups
while others do not, some do not accept PGP
encrypted messages; some even use different formats.
Their lack of a unified modus operandi complicates
their use and hinders their acceptance. The Mixmaster
[7] remailer written by L. Cottrell is a significant
step forward as it constitutes the first true mix.
2.2 Overview of desired properties 

We begin the technical discussion by enumerating
the desired properties of anonymous mail.

1. Anyone able to send email should be able to do
so anonymously.

2. It should be impossible (or, at least, computationally
hard) to determine the originator of anonymous mail.

3. The receiver(s) of anonymous mail should be able to
reply to the sender, who remains anonymous. (It is
important to note that replying, by definition, sacrifices
some anonymity because the original sender knows
the intended receiver(s) and can correlate a reply
with an earlier message. 4)

4. Individual remailers intervening in anonymizing
messages should be trusted as little as possible.
The anonymity of the end-points should be preserved
even if a number of intervening entities
collude or are subverted.

5. The remailer infrastructure should be resistant to
both passive and active attacks. (This property
is elaborated on below.)

6. The sender of anonymous email can (anonymously)
obtain confirmation that it has been
properly processed by the remailer system.

7. Anonymous email should not overload the global
email infrastructure. (For example, if anonymity
requires generation of email noise, its volume
should be kept low.)

2 This example is of proactive job search; it is different
from the usual reactive search whereby the job descriptions are
broadcasted to the "masses", e.g., by posting in appropriate
newsgroups.

3 On February 8, 1995, based on a burglary report filed with
the Los Angeles police, transmitted by Interpol, Finnish police
presented Helsingius a warrant for search and seizure. Bound
to do so by law, he complied, thereby revealing the electronic
address of a single user.

4 However, some degree of anonymity can be preserved. For
example, a reply to an anonymous newsgroup post only reveals
the newsgroup to the original poster; the identity of the replying
party remains secret.

2.3 Notation 

The following notation is used throughout the
remainder of the paper:

$M$                            message; sequence of ASCII bits
$E_X(M)$                       encryption of $M$ with X's public key
$D_X(M)$                       decryption of $M$ with X's private key
$K(M)$                         conventional encryption of $M$ with key $K$
$(M_1, M_2)$                   concatenation of $M_1$ and $M_2$
$A_X$                          X's email address
$\lceil M \rceil^{\Omega}$     padding string $M$ to length $\Omega$
                               (by appending random bits)
$\lfloor M \rfloor^{\Omega}$   trimming string $M$ to length $\Omega$
                               (by removing trailing bits)



3 MIX - fundamental building block

As already mentioned, an anonymous remailer, or a
mix, is an entity that, in addition to forwarding incoming
messages, strives to hide the relationship between
incoming and outgoing message traffic. (See Figure
1.) In our model we assume the existence of a powerful
adversary - Eve - capable of recording, removing or 
altering packets entering or leaving a mix. Eve is also 
able to generate spurious messages. 

A mix functions according to the following principle
[2]. Suppose Alice wishes to send message $M$ to
Bob anonymously. She submits a specially composed
message $I$ to the mix. $I$ includes $M$ and Bob's network
address. It is intelligible only to the mix. A
transformed version of $I$, called $O$, is forwarded by
the mix to Bob. Ideally the relation between the incoming
message, $I$, and the outgoing message, $O$, is
obfuscated. Thus, Eve is unable to connect Alice to
Bob. This kind of anonymity is called "unlinkability
of sender and recipient" [27].

There are two ways for Eve to correlate incoming
and outgoing messages: i) by contents, i.e., message
data or message size, or, ii) by causality, i.e., by
associating time of message arrival with that of its
departure.

In general, content correlation can be addressed by
using standard cryptographic techniques along with
padding. Causal correlation can be easily countered if
the incoming traffic volume is sufficiently high. In the
next section we focus on making content and causal
correlation difficult.

Figure 1: Basic Model

3.1 Passive attacks

This section addresses so-called passive attacks, i.e. 
those that can be carried out by merely observing mes- 
sage traffic. 

3.1.1 Content correlation 

Two elements can help in content correlation: ac- 
tual content and length. For prevention it suffices that 
all messages to/from a mix be encrypted and be of 
uniform length. We denote this length by $\Omega$.

The user encrypts his message $M$ and the destination
address $A_{Bob}$ with the mix's public key. Thus,
$I = E_{mix}(A_{Bob}, M)$, where $A_{Bob}$ is Bob's network
address.

Upon receipt and successful decryption of $I$, the
string $(A_{Bob}, M)$ is revealed. The output message
$O$, consisting of $M$ (in cleartext) and other data
added by the underlying communications network, is
forwarded to Bob at $A_{Bob}$. Eve may attempt to correlate
$O$ and $I$ by comparing $E_{mix}(A_{Bob}, M)$ and $I$.
To outwit Eve, a random one-time "salt" must be factored
into the encryption to ensure that successive encryptions
of the same message yield different results.
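
The re-encryption attack and the effect of the salt can be illustrated with a small sketch. It assumes a toy textbook-RSA key built from two small known primes, an 8-byte salt appended to the plaintext, and the hypothetical helper names `encrypt`, `encrypt_salted` and `decrypt_salted`; none of these come from Babel itself, and the key size is far too small for real use.

```python
import secrets

# Toy textbook-RSA key for the mix (assumption: demo-sized primes, NOT secure).
P, Q = 2**61 - 1, 2**89 - 1            # two known Mersenne primes
N, E = P * Q, 65537                    # public key of the mix
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent, known only to the mix

SALT_LEN = 8                           # illustrative salt size in bytes


def encrypt(plaintext: bytes) -> int:
    """Deterministic public-key encryption: same input gives same output."""
    m = int.from_bytes(plaintext, "big")
    if m >= N:
        raise ValueError("plaintext too long for the toy modulus")
    return pow(m, E, N)


def decrypt(ciphertext: int) -> bytes:
    """Private-key decryption, as performed by the mix."""
    m = pow(ciphertext, D, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


def encrypt_salted(plaintext: bytes) -> int:
    """Append a fresh random salt so that repeated encryptions differ."""
    return encrypt(plaintext + secrets.token_bytes(SALT_LEN))


def decrypt_salted(ciphertext: int) -> bytes:
    """Decrypt and drop the trailing salt."""
    return decrypt(ciphertext)[:-SALT_LEN]


payload = b"bob|hello"                 # stands for (A_Bob, M)

# Without salt: Eve sees the output, re-encrypts it and recognises the input.
unsalted_input = encrypt(payload)
eve_links_unsalted = encrypt(payload) == unsalted_input

# With salt: Eve does not know the salt, so her re-encryption does not match.
salted_input = encrypt_salted(payload)
eve_links_salted = encrypt(payload) == salted_input
```

In this sketch `eve_links_unsalted` is true and `eve_links_salted` is false, while the mix still recovers the payload with `decrypt_salted`.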

In hybrid systems based on both public and con- 
ventional key encryption the random string might be 
unnecessary. Such systems typically use a random ses- 
sion key to encrypt user data with a symmetric key 
algorithm and a public key encryption algorithm to 
encrypt the random session key. Each encryption with 
a public key uses a different session key, which is re- 
vealed only to the owner of the private key (the mix 
in this particular case). Thus, Eve is unable to correlate
$I$ and $O$ even though she is able to re-encrypt
$(A_{Bob}, M)$. The re-encryption results in $I'$, which
bears no resemblance to $I$; refer to [26] for cryptographic
attacks on straight-RSA implementation of
mixes.

In order to avoid size correlation, message sizes
must be constant throughout the entire mix network.
Message size uniformity can be achieved by padding
to a constant length ($\Omega$) with random data. Although
seemingly innocuous, padding is an important issue 
and greatly influences the implementation of a mix. 
A detailed discussion of this issue is postponed until 
Section 6. 
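
As a rough illustration of the padding and trimming operations from the notation, the following sketch pads with random bytes and trims trailing bytes. The value of `OMEGA` and the function names are assumptions made for the example; how a recipient learns the original length is not addressed here.

```python
import os

OMEGA = 1024  # assumed system-wide message size in bytes (illustrative value)


def pad(message: bytes, length: int = OMEGA) -> bytes:
    """Append random bytes so the result is exactly `length` bytes long."""
    if len(message) > length:
        raise ValueError("message longer than the uniform size")
    return message + os.urandom(length - len(message))


def trim(message: bytes, length: int) -> bytes:
    """Remove trailing bytes so the result is exactly `length` bytes long."""
    if len(message) < length:
        raise ValueError("message shorter than the requested size")
    return message[:length]


short = pad(b"hello")
longer = pad(b"a much longer message body " * 10)
sizes_match = len(short) == len(longer) == OMEGA
```

Both padded messages have the same size, so an observer comparing lengths learns nothing about which cleartext was longer.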

Note that the security of the system is based on the 
integrity of a mix. In a single-mix architecture, if the 
mix is somehow forced to reveal its private key, identi- 
ties of users can be compromised. Multiple mixes can 
be used to increase the security of the whole system. 
This is discussed in the following sections. 
3.1.2 Time correlation 

Obviously, there is a strong causal relationship be- 
tween the incoming and outgoing messages. This rela- 
tionship can be exploited by Eve. One simple solution 
is to output messages in batches, as outlined in [2]. 
In this scheme, at least N input messages are accu- 
mulated before being forwarded in random order. N 
is called the minimum batch size. We refer to this 
scheme as normal or regular batching. 

Under low load conditions, incoming messages may 
be so scarce that a batch of size N cannot be formed 
within a reasonable time. Sending out random-looking 
decoy messages to random destinations solves (or at 
least alleviates) the problem. Decoys are indistin- 
guishable from normal messages except that they are 
immediately discarded by their recipients after decryp- 
tion. 

In an enhanced scheme, called interval batching, we 
divide time into equal periods of length T. Let n be 
the number of incoming messages in a given period. 
The following procedure is performed at the end of 
each period: 

normal batching                            if $n \geq N$

$N - n$ decoys, followed by batching       if $0 < n < N$

This approach guarantees that a message will be de- 
layed at most T units of time by a mix. Note 
that batching messages introduces a risk because 
anonymity then depends on the behavior of other 
users. This external dependence can pave the way 
for other attacks (see Section 3.2.1.) 
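
The end-of-period procedure of interval batching might be sketched as follows. The function name, the `make_decoy` callable and the choice to emit nothing when no message arrived are assumptions for illustration; the text only specifies the cases with at least one incoming message.

```python
import random


def flush_interval(pending, min_batch, make_decoy, rng=random):
    """Return the messages a mix emits at the end of one period of length T.

    pending    -- genuine messages received during the period (n = len(pending))
    min_batch  -- minimum batch size N
    make_decoy -- hypothetical callable that produces one decoy message
    rng        -- source of randomness providing shuffle()
    """
    n = len(pending)
    if n == 0:
        return []                      # assumption: an empty period emits nothing
    batch = list(pending)
    if n < min_batch:
        # Top the batch up with N - n decoys before forwarding.
        batch += [make_decoy() for _ in range(min_batch - n)]
    rng.shuffle(batch)                 # forward in random order
    return batch
```

With two genuine messages and N = 5 the function returns five messages, three of which are decoys; with seven genuine messages it simply returns all seven in random order.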

Another popular approach to solving the time cor- 
relation problem involves introducing a random delay 
for each message. This randomness makes the system 
nondeterministic but not necessarily safer. We avoid
this avenue.

3.2 Active attacks 

In this section we discuss active attacks, i.e. those 
involving direct modifications to message flow, by al- 
tering, inserting, delaying and even deleting, mes- 
sages. 

3.2.1 Isolate & Identify 

If regular batching is used, Eve may submit a num- 
ber of messages to a mix, forming an almost complete 
batch, with only one message missing. Upon arrival 
of a genuine message, the entire batch is forwarded 
and Eve can simply pick out the message she did not 
generate [27]. Note that, although the genuine message
may be encrypted, Eve is able to correlate the
genuine message with its outgoing counterpart. The
mix is thus considered defeated. 
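
A small simulation can make the attack on regular batching concrete. The `BatchingMix` class and the message labels are hypothetical; encryption is omitted because Eve recognises her own outputs regardless, having composed them herself.

```python
import random


class BatchingMix:
    """Regular batching: hold messages until N have arrived, then flush shuffled."""

    def __init__(self, min_batch, rng):
        self.min_batch = min_batch
        self.rng = rng
        self.pool = []

    def submit(self, message):
        """Accept one message; return the flushed batch, or [] if still waiting."""
        self.pool.append(message)
        if len(self.pool) < self.min_batch:
            return []
        batch, self.pool = self.pool, []
        self.rng.shuffle(batch)
        return batch


def isolate_and_identify(mix, genuine):
    """Eve fills the batch with N - 1 of her own messages, then waits."""
    eve_messages = {f"eve-{i}" for i in range(mix.min_batch - 1)}
    for message in eve_messages:
        assert mix.submit(message) == []   # batch is not yet complete
    output = mix.submit(genuine)           # the victim's message triggers the flush
    # Everything Eve did not generate must be the genuine message.
    return [m for m in output if m not in eve_messages]
```

Whatever the shuffle order, the only message left after Eve removes her own is the genuine one.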

In the interval-based batching approach, flooding
a mix is useless if genuine traffic is heavy. However,
when few legitimate messages arrive in a given interval,
flooding causes the mix to believe that decoys are
unnecessary. Eve might even remove or rearrange messages
so that one real message trickles into the mix per
period. Then, by injecting false messages Eve is able
to link the single authentic message with its outgoing 
counterpart. 

This attack is difficult to thwart completely. One 
simple but only partial countermeasure is to require
a certain number of decoys even when a batch is full. 
A more effective approach is the introduction of inter- 
mix detours; it is discussed in Section 5.6.1. 

3.2.2 Message Replay 

Eve can try to defeat a mix by recording a gen- 
uine message and reinserting it later into the message 
stream. As an incoming message $I$ results in the same
output O when replayed, associating the two is triv- 
ial. Because of its simplicity, message replay is an 
extremely serious threat. It is possible to prevent re- 
play by keeping track of incoming messages and dis- 
carding replays [2]. Replay detection is a well-studied 
topic [8, 9]. Basic techniques consist of using sequence 
numbers, random numbers (nonces) or date and time
stamps.

Techniques involving sequence numbers or nonces 
imply at least some synchronization. However, there is 
an inherent contradiction between the terms synchro- 
nization and anonymity. Moreover, traditional meth- 
ods are concerned with authentication, which is not 
required in our case. Under these circumstances, we 
have decided to use a variant of a time-stamp scheme. 

In brief, each message is uniquely identified and
time-stamped. Clearly, the identifier should reveal no
information about the message. Assuming the use of
a hybrid message encryption cryptosystem (e.g., as in
PEM or PGP) we use the public key encrypted form
of the session key as the message identifier. Since a
message does not decrypt correctly even if a single bit
of the encrypted session key is altered, it is an invariant
of replays. The session key is unique with a high
degree of probability because it is usually generated
at random from a very large key space. 5 This method
is also very cost-effective since a mix does not have to 
perform any expensive operations to calculate unique 
identifiers for the incoming messages; it simply copies 
the encrypted session key. 

It is certainly undesirable to keep track of messages 
indefinitely as it would result in excessive space usage. 
A simple solution is to time-stamp messages and flush 
message entries after some fixed system-wide time
interval. This point is further discussed in Section 8.
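
One way to picture the time-stamp scheme is a cache keyed by the encrypted session key, flushed after a fixed window. The class below is a sketch under assumptions: the name `ReplayCache`, the injectable clock, and the rule that a message whose time stamp is older than the window is rejected outright (so that flushing an entry never reopens the door to a replay) are illustrative choices, not details taken from the Babel implementation.

```python
import time


class ReplayCache:
    """Remember message identifiers for a fixed window and reject repeats."""

    def __init__(self, window_seconds, clock=time.time):
        self.window = window_seconds
        self.clock = clock
        self.seen = {}                     # identifier -> arrival time

    def _flush_expired(self, now):
        """Drop entries that are at least one window old."""
        expired = [k for k, t in self.seen.items() if now - t >= self.window]
        for k in expired:
            del self.seen[k]

    def accept(self, encrypted_session_key: bytes, stamped_at: float) -> bool:
        """Return True for a fresh first-time message, False otherwise."""
        now = self.clock()
        self._flush_expired(now)
        if now - stamped_at >= self.window:
            return False                   # too old: its entry may already be gone
        if encrypted_session_key in self.seen:
            return False                   # replay within the window
        self.seen[encrypted_session_key] = now
        return True
```

The identifier is simply copied from the message, so the mix performs no extra cryptographic work, and memory use is bounded by the traffic seen within one window.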

Replying to messages is somewhat different because 
"replays" along a reply path are perfectly legitimate 
(see Section 5.6). 



5 PGP uses 128-bit IDEA keys. Moreover, before RSA-encrypting
this IDEA key, it randomly pads it to the modulus
of the public key.

3.3 Cascading or chaining mixes

We now assume that there is a pool of mixes at the 
users' disposal. As mentioned earlier, if only a single
mix is used, that mix is trusted to withhold critical 
information. Instead of trusting a single mix, Alice 
may decide to use a series of mixes to forward her 
message to Bob [2], see Figure 2. The system thus 
becomes more secure. 




Figure 2: Chaining mixes 

Eve's task becomes significantly more difficult. In 
fact, in order to link the message sent by Alice to the 
message received by Bob, Eve has to subvert/defeat 
the mixes on the path. 

If Eve is a global observer of the mix network she
can simply concentrate on messages entering and leaving
the system without paying attention to inter-mix
traffic. If the traffic load is low, the security degenerates
to the worst-case scenario outlined in Section
3.1.2. However, in practice, a large number (> 100)
of independent mixes distributed around the world 
would make it very difficult for Eve to be a global 
observer. 

4 Forward Path 

In this section we describe the process of generating 
anonymous messages and their subsequent handling 
at intervening mixes. Most of the material (with few 
exceptions) presented in this section is due to Chaum
[2, 3, 4].

For the sake of clarity, we assume (for the 
time being) that cryptographic operations (encryp- 
tion/decryption) have no impact on message size. We 
will return to this issue in Section 6. 
4.1 Composition by sender 

Suppose Alice wishes to send an anonymous message
to Bob through $f$ mixes: $F_1, F_2, \ldots, F_f$. This
set of mixes is referred to as the forward path. 6 She
composes her message according to the following
procedure:

(1) The cleartext message is padded to exactly $\Omega$
bytes. The maximum allowed cleartext message
size is $\alpha$, where $\alpha < \Omega$. This restriction
ensures that each message is padded with at least
$\delta = \Omega - \alpha$ random bytes. The reason for reserving
$\delta$ bytes for padding will become clear in Section
6. The parameters $\Omega$ and $\alpha$ are system-wide
constants.

(2) The padded message $\lceil M \rceil^{\Omega}$ is then encrypted
once for every mix on the forward path, starting
with the last, $F_f$. Each layer encloses the address of
the next hop together with the result of the previous
encryption, where $E_{F_i}$ represents public key
encryption with mix $F_i$'s key. The final outcome is:

$$E_{F_1}(A_{F_2}, E_{F_2}(\ldots E_{F_f}(A_{Bob}, \lceil M \rceil^{\Omega}) \ldots))$$

The result is analogous to an onion where each
encryption is likened to a layer of skin. To access
inner layers, outer layers must be stripped off first
(see Figure 3.) The effect of encryption on message
size is shown in the figure. In particular, the
dimensions of the boxes show how message size
increases with each encryption and concatenation
step.

Figure 3: Forward message prepared by Alice

(3) Once the onion is assembled it is sent to the first
mix on the forward path, $F_1$. 7

Note that the encryption steps are all performed 
at the sender, the only trusted entity. No encryption 
takes place at the successive mixes. This ensures that 
the information revealed to each mix is kept to a min- 
imum. 
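
The layering performed by the sender can be sketched as below. A toy XOR keystream cipher stands in for the public key encryption $E_{F_i}$; this substitution, the one-byte address length prefix, and the names `wrap`, `unwrap` and `build_onion` are all assumptions for illustration and offer no real security.

```python
import hashlib


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream; applying it twice decrypts.

    Stand-in for E_{F_i} / D_{F_i}: not secure and not public-key.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(key: bytes, next_hop: str, inner: bytes) -> bytes:
    """One layer E_F(A_next, inner); the address carries a 1-byte length prefix."""
    addr = next_hop.encode()
    return toy_cipher(key, bytes([len(addr)]) + addr + inner)


def unwrap(key: bytes, layer: bytes):
    """What a mix does: strip one layer, learn the next hop and the inner block."""
    plain = toy_cipher(key, layer)
    n = plain[0]
    return plain[1:1 + n].decode(), plain[1 + n:]


def build_onion(path, keys, recipient: str, padded_message: bytes) -> bytes:
    """path = [F_1, ..., F_f]; encryption starts with the last mix F_f."""
    onion = wrap(keys[path[-1]], recipient, padded_message)
    for i in range(len(path) - 2, -1, -1):
        onion = wrap(keys[path[i]], path[i + 1], onion)
    return onion
```

Peeling the onion in path order reveals exactly one next-hop address per mix, and only the last mix sees the recipient's address and the padded message.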

4.2 Processing by mixes 

The first remailer, upon reception of Alice's message,
decrypts it with its secret key to discover the
address of the next hop, $A_{F_2}$. This is analogous to
removing the first layer of skin of the onion.

Similarly, each mix on the forward path removes
a layer of encryption until the last hop is reached.
The last remailer strips off the remaining layer and
discovers $A_{Bob}$. The message is then delivered with
all padding removed.

6 We use the letter F to denote mixes on the forward path.

7 The binary data produced by encryption might be unsuitable
for transmission as email. In that case, an appropriate
format conversion must take place.

The actual message received by Bob shows that it
has been delivered by mix $F_f$. Bob does not know the
identities of the other mixes nor that of the originator,
Alice. 8

To accommodate the largest possible number of 
users, the system of remailers assumes that the re- 
cipient of anonymous messages has only basic email 
capability with no specific software to handle anony- 
mous messages. Only regular mail is delivered to the 
final destination. In other words the last mix sees the 
message in cleartext. 

If the destination has encryption capability (e.g., 
PGP) Alice can encrypt the message using the recipient's
public key. Thus, the contents of the message
are hidden from the last hop, and a higher degree of
security is achieved. Obviously, if the destination is a
public newsgroup, using secret keys makes little sense.

4.3 What does a mix know? 

One important security measure of the entire mix 
network is the amount of knowledge gained by a mix 
in the course of processing a message. By examining 
email-specific fields (e.g., SMTP headers) an interme- 
diate mix on the forward path can discover the identity 
of the previous mix hop. Without some questionable 
hacking of email software it appears impossible to pre- 
vent a mix from gaining this knowledge. 

Another piece of information visible to a mix is the 
identity of the next hop. It is possible - albeit in 
theory - to prevent an intermediate mix from knowing 
the next hop. 9 We briefly sketch one simple method:

Alice composes the anonymous message much as 
before but omits mix addresses - $A_{F_i}$ - from the layers.
Each intervening mix, instead of sending the message
to the next hop, posts it to a newsgroup periodically 
scanned by all mixes. (An alternative is to broadcast 
the message to all mixes.) All mixes try to decrypt the 
message but only one succeeds. The same procedure 
is repeated until the last mix is reached; the last mix 
forwards the message directly to Bob. 

Although it halves the knowledge gained by inter- 
mediate mixes this solution is fraught with difficulties: 
the performance overhead alone would be staggering. 
A more practical, but commensurately scaled down, 
variation is to give the sender an option to include 
multiple mix addresses in each layer. This way, an 
intermediate mix forwards an outbound message to 
several next-hop mixes and remains uncertain with re- 
spect to the identity of the actual next hop. 

5 Return Path 

Thus far we discussed how to send messages 
anonymously without enabling replies. Although 
uni-directional communication is most amenable to 
anonymity, it is sometimes desirable for an anonymous 
mail recipient to reply to the (still anonymous) sender. 
This can be achieved by giving the sender an option
of including Return Path Information (RPI) in the
anonymous message.

5.1 Creating the RPI 

The RPI is composed by Alice according to the fol- 
lowing procedure. 

(1) Alice chooses mixes $R_1, R_2, \ldots, R_r$ for the return
path and mixes $F_1, F_2, \ldots, F_f$ for the forward
path. (See Figure 4.) The mixes on the forward path
and return path are completely independent. The two
sets may be identical, overlapping or completely disjoint.

Figure 4: Return Path

(2) Alice randomly chooses a key seed - $KS$ -
and, using it, computes $r$ keys, $K_1, K_2, \ldots, K_r$.
There are many ways to do so, e.g.: $K_i = E(KS, i)$
for $1 \leq i \leq r$. These keys will be used
by the return mixes to encrypt Bob's reply.

(3) The key seed (along with the number of hops $r$)
is first encrypted with Alice's public key to form
$V_0 = E_{Alice}(KS, r)$.

(4) Then, once for every mix on the return path,
starting with the last, $R_r$, the following encryption
is performed:

$$V_i = (A_{R_{r-i+1}}, E_{R_{r-i+1}}(K_{r-i+1}, V_{i-1})) \quad (\text{for } 1 \leq i \leq r)$$

The final outcome, in which the innermost layer
additionally names Alice's address $A_{Alice}$ as the
last next hop, is:

$$(A_{R_1}, E_{R_1}(K_1, A_{R_2}, E_{R_2}(K_2, \ldots E_{R_r}(K_r, A_{Alice}, E_{Alice}(KS, r)) \ldots)))$$

We refer to the resultant block, shown in Figure
5, as a little onion, similar in construction to, but
smaller than, the forward-path onion.

Figure 5: Return Path Information

(5) Alice inserts the resulting RPI block into the
beginning of the cleartext message she wishes to
send. Then the procedure outlined in the previous
section is followed until the last remailer on
the forward path, $F_f$, is reached. $F_f$ detects the
RPI in the outbound message and modifies the
mail header such that a later reply by Bob would
be sent directly to $R_1$ and not to $F_f$. 10 (We assume
that the RPI is "visible" to the last hop mix.
It would be more secure to encrypt RPI for the
destination - Bob - but we try to avoid requiring
any cryptographic capability from Bob.)

8 Unless of course the message bears Alice's name or
signature.

9 Note that little can be done in case of the last hop mix.

5.2 Replying by recipient 

As mentioned above, the message sent by Alice and 
received by Bob is prefixed with an RPI block. The 
RPI is meant to be treated opaquely by Bob.

Bob composes his reply as usual and simply
prepends the RPI he received from Alice. He then
sends his reply to the first mix on the return path,
namely $R_1$.

The message received by $R_1$ is shown in Figure 6.

Figure 6: Bob's reply as received by $R_1$

5.3 Reply processing by remailers 

Upon receiving Bob's reply, $R_1$ detects the presence of
the RPI and extracts it. Let us denote this original RPI
by $RPI_0$. The first mix on the return path, $R_1$,
performs the following steps:

(1) Combine the header and body of the reply (without
the RPI) into a string $M'$. This is the string
that will ultimately reach Alice.

(2) Pad $M'$ to size $\Omega - \omega$.

(3) Decrypt $RPI_0$ to reveal the random key $K_1$ and
$A_{R_2}$, the address of $R_2$. Let $RPI_1$ denote the
new RPI, 11 which has one fewer layer of encryption.

(4) Encrypt $\lceil M' \rceil^{\Omega - \omega}$ with $K_1$ to form
$Y_1 = K_1(\lceil Header + Body \rceil^{\Omega - \omega})$.

(5) Send $(RPI_1, Y_1)$ to $R_2$. Note that the size of this
message is $\Omega$.

10 In RFC-822-compatible systems, this is achieved by including
a "Reply-To" field [6] in the header of the message sent to
Bob.

The next $r - 1$ remailers on the return path will
perform a similar operation. At mix $R_i$:

(1) After reception of $(RPI_{i-1}, Y_{i-1})$ decrypt $RPI_{i-1}$
to reveal $A_{R_{i+1}}$ and $K_i$. The resultant value is
denoted $RPI_i$.

(2) Encrypt $Y_{i-1}$ by $K_i$ to form $Y_i = K_i(Y_{i-1})$.

(3) Send $(RPI_i, Y_i)$ to the next hop $A_{R_{i+1}}$.

For the last mix on the return path, the operation
is identical except that the next hop's address will be
$A_{Alice}$ instead of $A_{R_{r+1}}$.

It is important to note that a reply message is
indistinguishable from a message on the forward path
because both have size $\Omega$. The structure of both
messages looks identical to an outside observer, i.e.
encrypted gibberish.

A mix is able to determine whether a message be- 
longs to the forward or reply flows by performing at 
most two decryption attempts. 

If decryption of the first $\omega$ bytes is successful then the
message is on the reply path.

Otherwise, the message is on the forward path and
the decryption of the entire message, $\Omega$ bytes, should
be successful. 12

5.4 Handling replies at the originator

Eventually, Alice receives the string (RPI_r, Y_r),
Bob's reply. However, by this time, all layers of
encryption have been removed except the last. Therefore
Alice sees:

RPI_r = E_Alice(KS, r) and

Y_r = K_r{K_{r-1}{...K_1{[M']}...}}

Decrypting RPI_r reveals KS and r and allows Alice
to regenerate K_1, ..., K_r. Successive decryptions
of Y_r with these keys yield M'. We note that in
Chaum's model [2], Alice has to remember the keys
K_1, K_2, ..., K_r in order to process the reply. In our
scheme, keys are embedded in the reply, considerably
simplifying the processing and allowing Alice to
remain stateless with respect to outstanding messages.

11 Assume for the time being that the size of this new RPI is
still ω.

12 If both decryption attempts fail, the message is discarded.
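The originator's side of reply handling can be sketched as follows. The text does not say how K_1, ..., K_r are regenerated from KS, so the derivation used here (SHA-256 of the seed and the hop index) is purely an assumption, as is the toy XOR keystream cipher. The point of the sketch is that Alice needs only KS and r, both recovered from RPI_r, and keeps no per-message state.

```python
import hashlib
from typing import List


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (a stand-in cipher)."""
    blocks = bytearray()
    counter = 0
    while len(blocks) < length:
        blocks += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(blocks[:length])


def apply_key(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call encrypts and decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def derive_keys(seed: bytes, r: int) -> List[bytes]:
    """Regenerate K_1..K_r from the seed KS (hypothetical derivation)."""
    return [hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
            for i in range(1, r + 1)]


def open_reply(seed: bytes, r: int, y_r: bytes) -> bytes:
    """Strip the layers K_r, ..., K_1 from Y_r to recover the padded M'."""
    data = y_r
    for key in reversed(derive_keys(seed, r)):
        data = apply_key(key, data)
    return data
```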

Note that M' is composed of Bob's reply and its
header as seen by the first mix. This header can be
used to identify Bob. Thus, a reply to an anonymous
message is not itself equally anonymous.



5.5 Two-way Anonymous Conversation

Despite the above, it is possible, under some
conditions, for Alice and Bob to communicate anonymously
in both directions. Suppose that Alice begins by
sending an anonymous message to Bob that includes an
RPI. Since Bob does not know Alice's identity, he must
use the RPI to communicate with Alice. He sends his
reply M' to R_1 (1st hop in RPI) anonymously through
mixes X_1, X_2, ... (see Figure 7). In other words, Bob
creates his own forward path and connects it to
Alice's RPI.






Figure 7: Bob's anonymous reply to Alice. 

Bob can also include an RPI in his message so that
Alice can reply to Bob's anonymous reply through yet
another series of mixes; see Figure 8. Thus, it is
possible for two parties to communicate electronically in
both directions without either party knowing the
identity of the other.

5.6 Security of replies 

Unfortunately, it is difficult (if not impossible) to
apply similar replay detection measures to replies as
to forward-bound messages. This is because it is
perfectly legitimate for multiple recipients to generate
several responses to a single anonymous message. (This
holds only if Alice explicitly allows replies by including
an RPI block.)

Thus, Eve can mount a replay attack. Note, however,
that in cases where replies are not wanted,
Babel allows messages to be sent without an embedded
RPI. Also, an RPI is not "tied" to a given sender;
it is trivial to create an RPI with a fake return
address. In other words, since RPIs are not digitally
signed, they can be repudiated.

Figure 8: Alice replying to Bob's anonymous reply.

5.6.1 Inter-Mix Detours 

A simple yet powerful way of strengthening the
security (i.e., untraceability) of replies is to introduce
inter-mix detours.

Let R_1, R_2, ..., R_r denote the mixes on the return
path. Normally a mix R_i (0 < i < r) forwards
the reply to the next hop R_{i+1}. In the detour mode,
R_i chooses a random forward path (called a detour)
D_1, D_2, ..., D_k, which consists of normal mixes drawn
from the global mix network. The message is then
anonymously forwarded to R_{i+1} through these mixes,
as shown in Figure 9.

There is nothing special about detoured messages;
they are regular anonymous messages, only constructed
by a mix and not a user.

A detour ensures that a message leaving Ri ap- 
pears different for each reply, in particular, during 
a replay attack. Compare this to the previous case 
where replies to the same anonymous message can be 
correlated by merely examining the exposed RPI. 
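The detour mode described above can be sketched as a path-selection routine. This is a minimal illustration: the function name, the string mix identifiers, and the decision to exclude the current mix and the next hop from the detour are assumptions, not details given in the text.

```python
import random
from typing import List, Optional


def choose_detour(all_mixes: List[str], current: str, next_hop: str,
                  length: int,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Pick `length` distinct random mixes, then end at the real next hop."""
    rng = rng or random.Random()
    # Assumption: the detour avoids the current mix and the real next hop.
    candidates = [m for m in all_mixes if m not in (current, next_hop)]
    if length > len(candidates):
        raise ValueError("not enough mixes for a detour of this length")
    return rng.sample(candidates, length) + [next_hop]
```

Because a fresh path is drawn for every reply, two replays of the same reply would normally leave the mix along different routes.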

Messages on the forward path could also be de- 
toured. However, if the mixes on the deviated path 
further detoured messages, endless detour loops would 
occur. To avoid this problem, detoured messages 
would have to be tagged accordingly. An important 
benefit that can be derived from detouring forward- 
bound messages is that, unlike Chaum's mixes [2], we 
can guarantee that even the originator of an anony- 
mous message can not recognize its own message as it 
leaves a mix. 

One slight drawback of introducing inter-mix detours
is that a mix now has to know about other mixes;
thus far, it has not been a requirement.

Figure 9: Inter-mix detours on replies.

5.6.2 Indirect replies

An entirely different approach to replies can also be 
envisaged: instead of delivering a reply directly to Al- 
ice, Bob can deliver it to a local newsgroup with a spe- 
cial number tag. Alice scans this newsgroup for replies 
matching that number tag. This method is roughly
analogous to the broadcast solution as described in
[10].

6 Keeping Message Sizes Constant 

In principle, a cryptosystem where encrypted mes- 
sages are of the same length as the corresponding 
cleartext can be devised, e.g. CFB mode of DES 
with a pre-distributed initialization vector. In prac- 
tice, however, ciphertext is usually somewhat longer 
than cleartext. In hybrid-key cryptosystems the size 
increase is particularly noticeable due to the need to 
include an encrypted random session key in addition 
to the ciphertext. Conversely, decryption results in a 
shorter message. 

Thus, the length of an email message would de- 
crease after each decryption as it travels through the 
mixes. The differences in size can be exploited by Eve. 

The problem can be solved if each mix pads the
outgoing message to Ω. Although all messages would
have the same size for an eavesdropper, the decrease
would still be visible to remailers. This allows them to
make educated guesses as to the number of preceding
or following hops, and is contrary to one of our goals
set in Section 2.2.

Each mix should know only the identity of the pre- 
vious and next hop and nothing else about the path of 
a message. The first and last hops are a little different 
because they can learn the identity of the sender and 
the recipient, respectively. 

Furthermore, the number of preceding and follow- 
ing hops should be kept secret. Although the message 
sent by Alice is indistinguishable from other inter-mix 
traffic, the first hop can infer that it is the first hop 
by comparing Alice's address with the list of known 
mixes. In a similar fashion, the last hop can deduce 
that it is the last. However, all others, i.e. interme- 
diate hops, should not know the number of preceding 
hops nor the number of following ones. 

Chaum [2] presents a general solution where data 
is divided into a fixed number of fixed-size blocks. 
This is the solution implemented in the Mixmaster
package [7].

Here we present another approach that is simpler 
and more storage-efficient. The basic idea is to ensure 
that some padding (encrypted or not) always follows 
information-carrying data. An example should make 
the point clear. 

Let string C of length Ω be composed of M bytes
of data followed by P = Ω - M bytes of padding. Also
suppose the encrypted version of C is denoted by C',
having length Ω + δ. If δ < P then trimming δ trailing
bytes of C' has no impact on the encrypted version
of the data but only on the encrypted version of the
padding; see Figure 10. In other words, trimming δ
bytes results merely in the loss of the original padding
but not in data loss.
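The pad, encrypt, trim sequence can be demonstrated with a small sketch. The cipher here is a toy XOR keystream whose δ bytes of expansion come from a random nonce placed in front of the ciphertext; this stands in for the session-key overhead of a real hybrid cryptosystem and is an assumption of the example, as are the constants `OMEGA` and `DELTA`.

```python
import hashlib
import os

OMEGA = 128   # uniform message size Ω in bytes (illustrative value)
DELTA = 16    # ciphertext expansion δ in bytes (here: a random nonce)


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (a stand-in cipher)."""
    blocks = bytearray()
    counter = 0
    while len(blocks) < length:
        blocks += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(blocks[:length])


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Return nonce + ciphertext, i.e. len(plaintext) + δ bytes."""
    nonce = os.urandom(DELTA)
    stream = keystream(key + nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Decrypt whatever follows the nonce, even if the tail was trimmed."""
    nonce, body = ciphertext[:DELTA], ciphertext[DELTA:]
    stream = keystream(key + nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def pad_encrypt_trim(key: bytes, data: bytes) -> bytes:
    """Pad data to Ω, encrypt to Ω + δ, then trim δ trailing bytes."""
    padding = OMEGA - len(data)
    if padding <= DELTA:
        raise ValueError("padding P must exceed the expansion δ")
    c = data + os.urandom(padding)   # string C of length Ω
    c_prime = encrypt(key, c)        # C' of length Ω + δ
    return c_prime[:OMEGA]           # back to Ω; only padding is lost
```

Decrypting the trimmed output yields Ω - δ bytes whose first M bytes are the original data, since each decrypted byte depends only on what precedes it.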




Figure 10: Padding — Encrypting — Trimming 

For the previous statement to hold, the encryption
algorithm should be such that correct decryption of
a given block depends on some or all of the previous
blocks but not on following blocks. This is true for
most encryption algorithms. We also note that if the
encryption package used embeds CRC or length
information about the cleartext, alterations made to the
ciphertext will be detected, leading to possible
rejection of the message. This issue is further discussed in
Section 8.2.2.

7 Heeding anonymity 

In the preceding sections we defined the notion 
of a mix, the potential threats facing it and re- 
quirements for constructing mixes that provide bi- 
directional anonymity. This section attempts to for- 
malize and analyze the degree of anonymity a par- 
ticular remailer system provides. In particular, the 
notions of confusion and staunchness are introduced 
and defined. 

7.1 Fixed-Path Systems 

Until now we made an assumption that a mix path
is chosen at random from a large pool of available
mixes. This should not necessarily be so. An
interesting way to increase the overall traffic load is to use
the same fixed mix path for all messages [24]. We
denote this path by M_1, M_2, ..., M_m. In this
configuration, messages always enter the system at M_1, are
forwarded to M_2, then to the next mix, and so on until
they leave the system at M_m.

By forcing all messages to visit all mixes pertaining 
to the fixed path, the traffic going through each is 
maximal. There are other advantages of using a fixed 
path. The mix network becomes more reliable, less 
chaotic and much easier to manage. 

Maximizing traffic load might seem contrary to 
good engineering practices. Clearly, if a mix is over- 
whelmed by sheer traffic volume, data loss can occur. 
This is not a serious drawback because, as the traffic 
increases beyond the processing capacity of the mixes, 
other fixed paths can be introduced to offload the pre- 
vious fixed path(s). 

Owing to practical considerations, the number of
mixes on the fixed path, m, is clearly limited. Thus,
the advantage of using a large number of mixes is lost.
Now it is much easier for Eve to monitor the entire
system. She can even learn a great deal by watching only
the first and last mixes, M_1 and M_m. Consider the
following attack where Eve allows only a single message
to trickle into an interval-batching mix network.




Figure 11: The Trickle Attack 

This attack is referred to as the trickle attack
because Eve allows only a single genuine message to enter
the system. By observing the output of the last mix,
M_m, she can correctly correlate the genuine message
with its corresponding output.



Decoys might be used to outfox the trickle attack. 
However, as many users would be alarmed or even 
upset by receiving decoy messages, we do not allow 
them to leave the mix network. Unfortunately, inter- 
mix decoys do not confuse Eve. 

7.2 System staunchness, miss & guess factors

We define the miss factor, denoted M, for a mix
network as the probability of making an incorrect
correlation between a message entering the mix network
and a message leaving it. It represents the measure
of confusion introduced by the mixes. Similarly, the
guess factor, denoted G, is defined as the probability of
making a correct correlation. (Obviously, G + M = 1.)

Consider the fixed-path case where the intervening
mixes use regular batching with the batch size set to
N. Then, G for the fixed path is equal to 1/N. It
is interesting to note that the result is identical to
the guess factor of a single mix. What is then the
advantage of chaining through several mixes?

A chain of mixes is more secure than a single mix
because Eve has to subvert all mixes in order to break
the anonymity chain. In other words, a chain of mixes
is more secure than a single mix but not necessarily
more confusing. We define the staunchness, S, of a
mix network as the number of secret keys needed to
defeat message anonymity. In all schemes described
thus far, staunchness is equal to the number of mixes
a message travels through.

7.3 The Quest for Confusion 

Consider the fixed path case where intervening 
mixes use interval-based batching instead of regular 
batching. Assuming the clocks of remailers are per- 
fectly synchronized and message transmission time is 
small but non-zero 13 , the itinerary of a group of mes- 
sages arriving during interval t is depicted in Figure 
12. 

A message entering the system at interval t will
leave it at time (m · T), along with the rest of the
messages that entered during the same interval. The guess
factor for period t is given by

G_t = 1 / n_t

where n_t is the number of messages entering the
system in interval t.

Thus, if few messages enter the system, the prob- 
ability for correct correlation is close to one. This is 
what one would expect by intuition. 

For obvious reasons, the higher the value of the
miss factor, the better. One could simply increase the
duration of the interval, T, to augment the average
number of incoming messages per period. However,
this has a negative impact on the average delay
experienced by messages. They will be delayed on
average by T/2 in the first mix and for a full period at
each of the following mixes. Thus, the total average
delay for the fixed path, neglecting transmission and
processing time, is given by

E[Delay] = T(1/2 + m - 1) [sec]

where m is the number of remailers on the fixed path.

13 So that messages arrive at the following mix during a new
interval.

Figure 12: Interval batching with synchronized clocks

7.3.1 Probabilistic deferment 

Continuing our pursuit of confusion, we now intro- 
duce a new scheme based on the time interval method 
but with an added twist. The "twist" is that at the end 
of each time interval, some of the incoming messages 
are deferred for an additional time period while all 
other messages are sent with no further delay 14 . We 
refer to this scheme as probabilistic deferment with 
interval batching. 

The decision to defer a given incoming message is 
taken by flipping a biased coin. Let q be the probabil- 
ity of forwarding the message at the end of the current 
interval and d = 1 - q the probability of deferring it 
for an additional period. 

Let the random variable K denote the number of
times a given message leaving the mix system has been
deferred. The probability mass function of K is given
by

P{K = k} = \binom{m}{k} q^{m-k} d^k, where k = 0, ..., m,

which is the binomial distribution. The expected value
of K is simply

E[K] = md

14 Incoming and deferred messages are distinguished by
keeping appropriate state information.

Thus, with the new scheme, a message on
average will be delayed by:

E[delay] = T(1/2 + m - 1) + T · m · d [sec]

where the last term, T · m · d, is the average additional
delay.

Note that in the worst case a message may be
delayed as long as 2Tm seconds: delayed for a full
interval and also deferred on all m mixes.
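The delay formulas above are easy to evaluate numerically. The sketch below simply transcribes them; the function names and the sample values (T = 60 seconds, m = 5, d = 0.5) are chosen for illustration only.

```python
def base_delay(T: float, m: int) -> float:
    """Average delay with plain interval batching: T(1/2 + m - 1)."""
    return T * (0.5 + m - 1)


def deferment_delay(T: float, m: int, d: float) -> float:
    """Average delay with probabilistic deferment: adds T * m * d."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must be a probability")
    return base_delay(T, m) + T * m * d


def worst_case_delay(T: float, m: int) -> float:
    """A full interval plus one deferment at each of the m mixes."""
    return 2 * T * m


if __name__ == "__main__":
    T, m, d = 60.0, 5, 0.5
    print(base_delay(T, m), deferment_delay(T, m, d), worst_case_delay(T, m))
```

With these sample values the average delay grows from 270 to 420 seconds, while the worst case is 600 seconds.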

With the new scheduling policy the opponent has
to guess both the interval to which a message belongs
(i.e., k) and also its position in that interval. Presuming
that the number of messages arriving at each period
is roughly the same 15 , Eve's best guess is to assume
the most likely deferment event.

Thus the guess factor for the new policy for the
interval t, designated Ĝ_t, is given by the guess factor
for simple interval batching, times the probability of
the most likely deferment event, i.e.

Ĝ_t = G_t · P{most likely k}

For a binomial variable B with parameters (m, d),
where 0 < d < 1, as b goes from 0 to m, P{B = b}
first increases monotonically and then decreases
monotonically, reaching its largest value for ⌈E[B]⌉ 16 , the
smallest integer greater than or equal to m · d. For
a rigorous proof, refer to [34]. For a less rigorous
but amusing proof, the reader can approximate the
binomial by the Poisson distribution, generalize the
factorial to the gamma function 17 and then take the
derivative with respect to a now continuous b.

Figure 13 shows the probability of the most likely
event, P{B = ⌈E[B]⌉}, as a function of the deferment
probability d for even and odd values of m.

Clearly, for odd values of m, the probability of the 
most likely event is minimal for d = 1/2. This is a little 
different for even values of m, for which the minimum 
is reached for values of d not too far away from 1/2. 
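The probability of the most likely deferment event can be checked numerically with exact fractions. The sketch below is illustrative only; the function names are hypothetical, and the most likely k is found by brute force over k = 0, ..., m rather than by the closed-form rule quoted above.

```python
from fractions import Fraction
from math import comb
from typing import Tuple


def pmf(m: int, d: Fraction, k: int) -> Fraction:
    """P{K = k} for a binomial with m trials and deferment probability d."""
    q = 1 - d
    return comb(m, k) * q ** (m - k) * d ** k


def most_likely(m: int, d: Fraction) -> Tuple[int, Fraction]:
    """Return one maximizing k and its probability (ties are possible)."""
    best = max(range(m + 1), key=lambda k: pmf(m, d, k))
    return best, pmf(m, d, best)


def deferred_guess_factor(n_t: int, m: int, d: Fraction) -> Fraction:
    """Guess factor of interval batching times P{most likely k}."""
    return Fraction(1, n_t) * most_likely(m, d)[1]
```

For m = 5 and d = 1/2 the maximum probability is 5/16; it is reached at both k = 2 and k = 3, since the distribution is symmetric for odd m.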

For a numeric example, suppose m = 5 and d = q =
1/2. The most likely value for k is 3, with probability
5/16. Thus, the probabilistic deferment method
introduces an additional uncertainty of 11/16. If we had
simply doubled the time interval to T' = 2 · T, so as to have
the same delay in the worst case, then the decrease in
the guess factor would be only 1/2. The probabilistic
deferment method compares well with simple interval
batching for all values of m > 1, even in the worst
case.

15 Otherwise, messages are likely to belong to the most
populated interval.

16 "E" does not mean encryption here.

17 Like the exponential function, the gamma function is also
equal to itself when derived. However, it is only defined for ℝ+.

Figure 13: Probability of the most likely b as a function of d



7.3.2 A Hybrid Approach

A hybrid configuration, referred to as fixed-set random
order path, imposes a fixed set of mixes but allows
traversing them in any order chosen at random, with
each mix visited only once. As with the fixed path
method, the traffic load is optimal. However, there
are no critical lines. Eve must observe and control all
communications lines to defeat the mix network. The
probabilistic deferment approach can also be put to
use to increase the confusion factor further.

It is very difficult to calculate the confusion factor
for the fixed-set random order system. However, it
combines some of the best features of the methods
mentioned so far.

8 Implementation

An anonymous remailer conforming to the ideas
and requirements described in this paper was
implemented at the IBM Zurich Research Laboratory during
the first half of 1995. This section discusses some of
the salient aspects of the implementation.

8.1 Computing environment

The popular script language Perl [32, 33] was used
to implement BABEL. Perl is readily available on
most Unix platforms and is well-suited for processing
loosely structured data such as email messages. We
opted for the latest incarnation of Perl, version 5.

We recognize that there is an inherent performance
cost involved in using an interpreted language such as
Perl. However, the impact of interpreting the code
at run time is negligible compared to that of
cryptographic operations, which are notoriously costly in
terms of processing power.

8.2 Pretty Good Privacy or PGP

Because it is the most popular email encryption
software, we chose PGP to provide the cryptographic base.
PGP combines the convenience and security of
public-key algorithms with the high speed of conventional
cryptography. It offers full-blown message privacy and
authentication, based on RSA [30] and IDEA [14, 15].

Since it was designed with mass appeal in mind,
PGP is well-suited for interactive use. Unfortunately,
this is not the case for automated (batch) processing;
error conditions require unexpected user interaction,
and the return codes are at times confusing.

8.2.1 PGP file format

With PGP, an email message can be compressed,
encrypted and signed, but the user can view its
contents and verify its signature with a single command.
At the byte level this is achieved by embedding a
compressed packet inside a hybrid RSA-IDEA encrypted
packet. This packet itself is then embedded in a
signature packet (Figure 14), which can in turn be embedded
in a radix-64 ASCII armor.

Figure 14: Multiple Packet Embedding

PGP recursively processes each packet type until an
unknown type, i.e. user data, is encountered.
Although this might be the correct behavior at the user
level, it is inadequate when multiple encryption is
used. In that case, PGP 18 attempts to continue
decrypting after a first successful decryption. The
second decryption operation will usually fail because the
secret key needed to perform the operation will be
missing (a mix does not know the secret key of other
mixes).

Furthermore, PGP is meant to be used for email
privacy and authentication but not sender anonymity.
PEM is even worse in this respect, as the unencrypted
PEM message headers contain identification of both
sender and recipient [31]. The cleartext part of PGP
message headers also contains sensitive information
that can be used by an attacker to correlate
messages. This potential threat was carefully studied, and
a version-independent PGP format parser was
developed at the earlier stages of the project.

18 Behavior observed with the "+force" option required for
batch processing.




Figure 15: Data format for encrypted file 

8.2.2 Side effects of encryption 

By default, PGP attempts to compress cleartext
before encrypting it 19 . However, since uniform message
size is a concern, compression is always turned off.
This prevents messages from shrinking.

There is another reason for turning off compression.
In compression mode, PGP adds a CRC of the
cleartext into the ciphertext. This causes PGP to reject
files altered in any way, particularly trimmed files. As
mentioned in Section 6, trimming is used to enforce
uniform message size. Fortunately, when compression
is turned off, PGP records only the length of the
cleartext message. Thus, alterations to data length are
detected but not those to contents.

To be precise, PGP rejects messages that are shorter
than the prerecorded value, but accepts longer ones.
This behavior can be explained by considering that,
in an email message that includes PGP ciphertext, the
cleartext (e.g., mail headers) usually precedes the PGP
part of the message. Moreover, additional cleartext
(e.g. a cleartext signature) usually follows the PGP
ciphertext. We capitalize on this behavior to make
the forward and reply messages indistinguishable, as
presented in Section 5.3.

19 Some think that compression enhances security; we do not.

8.2.3 Radix-64 format

As PGP-encrypted files are in binary format, some
sort of conversion must take place to send encrypted
data over 7-bit channels such as email. A remarkably
simple and efficient conversion method is radix-64
armoring. It is defined in [16].

8.3 Remailer deployment

A Babel mix is designed to act as a filter installed 
in the .forward file. Refer to [5] for further information 
on email filters. Any user can transform his computer 
account into an anonymous remailer in a matter of 
minutes, without having any administrator privileges. 
Personal email is treated as usual, but anonymous 
mail is filtered and processed without ever cluttering 
the user's mailbox 20 . This is compatible with the In- 
ternet's populist philosophy. However, note that this 
paves the way for a security breach. Since Babel is 
designed with a minimum of human intervention in 
mind, the password needed to access the secret key of 
a remailer is stored in cleartext, in a read-protected 
configuration file. Although this file is not accessible 
to a casual user, the system administrator can usually 
override the safeguards. Furthermore, a popular re- 
mailer site can attract swarms of messages. This can 
result in serious performance degradation on the local 
host. 

The actual deployment of Babel has been delayed
due to U.S. export restrictions on cryptographic
paraphernalia. Restrictions apply not only to
cryptography per se but also to equipment that makes use
of cryptography. In particular, although BABEL does
not contain a single line of cryptographic code and
relies completely on PGP, it is still subject to the
aforementioned export restrictions.

8.4 Proxies 

In order to appeal to the greatest number of users, 
BABEL offers a so-called proxy mode of operation. In 
this mode, a user with no Babel software can ask any 
mix to compose and forward an anonymous message 
on the user's behalf. The proxy mix is also able to sub- 
stitute itself for the user in order to process multiply 
encrypted replies. Consequently, it is possible for any 
bare-bones email user to send anonymous messages 
and receive replies. 21 

The proxy mode of operation is somewhat less
secure because traffic to the proxy mix flows in cleartext.
However, users equipped with PGP but no Babel software
can send their orders encrypted with the proxy mix's
public key.

8.5 Message length — Concrete values 

The Internet email "bible", RFC 1123 [13], specifies
that any mailer software should be able to send
and receive messages at least 64 Kbytes in length
(including header). Taking into account a 33% increase
due to radix-64 armoring, the maximum uniform message
size, Ω, we could safely adopt is 48 Kbytes 22 . Being
concerned by network bandwidth, we opted for half
that number, i.e. 24 Kbytes.

20 Unless an error occurs while processing the message.
21 This is particularly applicable to non-Unix users.

For a 512-bit public key, PGP increases message size
by about 115 bytes at each encryption. Experiments
show that the thickness of a layer of the anonymous
onion is on average approximately 220 bytes.
Therefore, when 2 Kbytes of padding are used, a message
can safely include nine layers of encryption. The
recommended RPI size ω is 1.5 Kbytes. This allows
approximately seven mixes on the return path.
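The size budget above can be reproduced with a few lines of arithmetic. The sketch assumes 1 Kbyte = 1024 bytes and models radix-64 armoring as a pure 4/3 expansion (ignoring line breaks and armor headers); the variable names are illustrative.

```python
KB = 1024  # assumption: 1 Kbyte = 1024 bytes

rfc_1123_minimum = 64 * KB                # size every mailer must handle
max_uniform = rfc_1123_minimum * 3 // 4   # undo the 4/3 radix-64 growth
chosen_uniform = max_uniform // 2         # the authors opted for half

layer_thickness = 220                     # average bytes per onion layer
padding = 2 * KB
rpi_size = 3 * KB // 2                    # recommended RPI size, 1.5 Kbytes

forward_layers = padding // layer_thickness   # whole layers that fit
return_hops = rpi_size / layer_thickness      # about seven, slightly under

if __name__ == "__main__":
    print(max_uniform // KB, chosen_uniform // KB)
    print(forward_layers, round(return_hops, 2))
```

With these figures, 2048 bytes of padding hold nine whole 220-byte layers, and 1536 bytes of RPI correspond to just under seven layers, which matches the "approximately seven" quoted above.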

We intentionally chose not to provide support for
larger files. This is the accepted practice on
existing remailers. It is meant to frustrate the anonymous
transmission of graphic files, which tend to be very
large 23 . It is still possible to split larger files into
smaller pieces and send them anonymously.

8.6 Time Synchronization & Replay Detection

As mentioned in Section 3.2.2, each layer of the
onion created by Alice includes a time stamp. The
value of the time stamp, referred to as θ, is the
number of seconds elapsed since January 1, 1970
GMT, to the moment of message composition by the
sender.

A Babel mix uses a two-step replay detection.
First, it records a unique identifier of the message as
described in Section 3.2.2. As long as the record is
in the database, replays are detected. However, in
order to keep the database size reasonable, the record
is deleted at time (θ + Δ). Thereafter, any message
bearing the timestamp θ or older will be discarded as
being too old, not necessarily for being a replay.
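The two-step check can be sketched as a small in-memory filter. This is an assumption-laden illustration: the unique identifier is modelled as a SHA-256 digest of the message (the identifier of Section 3.2.2 may differ), the database is a plain dictionary, and the class and method names are hypothetical.

```python
import hashlib
from typing import Dict

DELTA = 24 * 3600  # record lifetime Δ in seconds (24 hours)


class ReplayFilter:
    """Remember message identifiers until time θ + Δ, then rely on age."""

    def __init__(self, delta: int = DELTA) -> None:
        self.delta = delta
        self.seen: Dict[str, int] = {}  # identifier -> time stamp θ

    def _expire(self, now: int) -> None:
        """Delete every record whose deletion time θ + Δ has passed."""
        self.seen = {ident: theta for ident, theta in self.seen.items()
                     if theta + self.delta > now}

    def accept(self, message: bytes, theta: int, now: int) -> bool:
        """Return True if the message should be processed."""
        self._expire(now)
        if theta + self.delta <= now:
            return False  # too old; a replay of it could not be detected
        ident = hashlib.sha256(message).hexdigest()  # assumed identifier
        if ident in self.seen:
            return False  # replay detected via the database
        self.seen[ident] = theta
        return True
```

A message is thus rejected either because its identifier is still recorded or because its time stamp has aged past Δ, so the database never has to hold records older than Δ.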

Time stamps are introduced merely to keep the
replay database small. Thus, only loose clock
synchronization is needed. Assuming the total delay
experienced by messages at remailers to be about one hour,
we chose Δ to be 24 hours, one order of magnitude
larger than the message delay. Thus, the time it takes
to visit all mixes on the forward path is considered
negligible with respect to Δ.

With such a coarse value of Δ, it is sufficient that
hosts keep clocks accurate within a day for the system
to function properly.

9 Conclusions 

This paper presented an anonymous remailer system
called Babel. Babel is flexible enough to allow
both sending and receiving anonymous electronic
messages. Anonymity criteria have been defined in order
to compare degrees of anonymity provided by various
configurations.

The basic components of Babel, mixes, are not 
aware of each other and learn very little about mes- 
sages they process. In contrast to some currently- 
operating remailers, Babel mixes do not depend on 
(potentially treacherous) alias tables. 

The software implementation of Babel is based on
freely available ingredients: Perl and PGP. At the
same time, the system remains accessible to users with
only a basic email capability through the use of its
proxy mode.

A Babel mix can be very easily set up by any 
user having only a simple Unix account. However, 
it is envisaged that setting up an Internet-wide mix 
network (mesh) will take some time. 

As with any new technology, some abuse is
unavoidable. Caveat Emptor!